Abstract

Despite recent claims to the contrary, we argue that Renyi's entropy is an observable quantity. It is
shown that, contrary to popular belief, the reported domain of instability for Renyi 
entropies has zero measure (Bhattacharyya measure). In addition, we show the 
instabilities can be easily emended by introducing a coarse graining into an actual 
measurement. We also clear up any doubts regarding the observability of Renyi's 
entropy in (multi-)fractal systems and in systems with absolutely continuous prob- 
ability density functions. 
I. INTRODUCTION

The thermodynamical or statistical concept of entropy, though deeply rooted in physics, is rigorously
defined only for equilibrium systems or, at best, for adiabatically evolving systems. In fact,
the very existence of entropy in thermodynamics is attributed to Caratheodory's inaccessibility
theorem [1], and the statistical interpretation behind the thermodynamical entropy is then
usually provided via the ergodic hypothesis [2,3]. It is, however, a highly non-trivial matter to find
a proper conceptual ground for the entropy of systems away from equilibrium, non-ergodic systems
or equilibrium systems with "exotic" non-Gibbsian statistics (multifractals, percolation, polymers
or protein folding provide examples). It is frequently said that entropy is a measure of disorder,
and while this needs many qualifications and clarifications it is generally believed that this does
represent something essential about it. Information theory might then be viewed as a pertinent
mathematical framework capable of quantifying the "measure of disorder". It is an undoubted
advantage of information theoretic approaches that whenever one can measure (or control) information
one can also measure (or control) the associated entropy, as the latter is essentially an average
information about the system in question [4,5].

In recent years there have been many attempts to extend the equilibrium concept of entropy 
to more generic situations applying various generalizations of the information theory. Systems 
with (multi-)fractal structure, long-range interactions and long-time memories might serve as 
examples. Among the multitude of information entropies, Shannon's entropy, Renyi entropies and
Tsallis-Havrda-Charvat (THC) non-extensive entropies [6] have found utility in a wide range of 
physical problems. Shannon's entropy is known to reproduce the usual Gibbsian thermodynamics 
and is frequently used in such areas as astronomy, geophysics, biology, medical diagnosis and 
economics (for the latest developments in Shannon's entropy applications the interested reader may 
consult Ref. [7] and citations therein). Renyi entropies were conveniently applied, for instance, in 
multiparticle hadronic systems [8], fractional diffusion processes [9] or in multifractal systems [10]. 
THC entropy was recently used in a study of systems with strong long-range correlations and in 
systems with long-time memories [11]. 
Despite their information theoretic origin, some doubt has been raised regarding the
observability of Renyi entropies [12]. Some authors went even as far as to claim that instabilities
in systems with a large number of microstates completely invalidate the use of Renyi entropies in all
physical problems [13]. This is rather surprising since Renyi's entropy is routinely measured in
numerous situations ranging from coding theory and cryptography [14] (where it regulates the
optimality of coding), through chaotic dynamical systems [10] (where it determines the generalized
dimensions for strange attractors) and earthquake analysis [15] (where it is used to evaluate the
distribution of earthquake epicenters and lacunarity) to non-parametric mathematical statistics
(where it prescribes the price of constituent information). Besides, Renyi entropies directly provide
measurable bounds in quantum-information uncertainty relations [16].

In the present paper we aim to revise Lesche's condition of observability. We illustrate this in
various contexts: systems with a finite number of microstates, systems with an infinite (but countable)
number of microstates, systems with absolutely continuous probability density functions (PDF's)
and multifractals. We show that it is not quite as simple to define the ubiquitous concept of
observability. We propose a less restrictive observability condition and demonstrate that Renyi
entropies are observable in this new framework. In what follows we will give some considerations
in favor of the above statement.

The paper is organized in the following way: In Section II we discuss Lesche's criterion of
observability which frequently forms a core argument against observability of Renyi entropies. We
argue that the criterion is unnecessarily restrictive and, in fact, many standard physical phenomena
which are observed and measured in the real world do not obey Lesche's condition. In Section III we
present some essentials of Renyi entropies required in the main body of the paper. In Section IV A
we argue that for a finite number of microstates Renyi entropies easily conform with Lesche's
criterion, i.e., they are observable. In Section IV B we extend our analysis to a countably infinite
number of microstates. Here the appearance of instabilities may be observed. The latter can be
traced to a large sensitivity of Renyi entropies to (ultra) rare-event systems. We demonstrate that
when coarse graining is included into realistic measurements, the instabilities get "diluted"
and Renyi entropies once again obey Lesche's condition. In Section IV C we propose a more realistic
criterion of observability where we allow for a certain amount of instability points, provided the
latter ones have measure zero. To this end we employ the Bhattacharyya statistical measure -
i.e., the natural measure on the space of non-parametric statistics. We prove that the Bhattacharyya
measure of the above "critical" distributions is in fact zero. Finally, we analyze in Section V
systems with continuous probability distributions and multifractal systems. We find that the very
nature of the absolute continuity of PDF's and the multifractality prohibits per se an appearance
of instability points.

II. LESCHE'S CRITERION OF OBSERVABILITY 

In order to explain fully the apparent inconsistencies in the recent claims concerning non- 
observability of Renyi entropies we feel it is necessary to briefly review the main points of Lesche's 
observability criterion. While we hope to discuss all the salient points, a full discussion can be 
obtained in Lesche [12]. Our discussion will be in terms of a scalar quantity G(x). Following [12], 
a necessary condition for $G(x)$ with the state¹ variable $x \in X \subset \mathbb{R}^n$ to be observable is the following:
Let

$$\|x - x'\|_1 = \sum_k |x_k - x'_k|$$

be the Holder $l_1$-metric on $\mathbb{R}^n$; then for every $\varepsilon > 0$ there exists an ($x$-independent) $\delta_\varepsilon > 0$ such that for
any pair $x, x' \in X$ one has

$$\|x - x'\|_1 \le \delta_\varepsilon \;\Rightarrow\; \frac{|G(x) - G(x')|}{G_{\max}} < \varepsilon . \tag{1}$$

¹Here and throughout, the state space $X$ represents the sample space of mathematical statistics, i.e.,
the space over which the probability distributions operate. In simple situations this coincides with the
set of all possible outcomes in some experiment. Generally, the elements of $X$ can represent probability
distributions themselves provided a suitable measure is defined. This fact will be used in Section IV.

From a strict mathematical standpoint (1) is, in fact, the definition of the uniform metric continuity
of $G(x)$ on the state space $X$. Informally, Eq.(1) states that points from $X$ which are close in the
sense of $\|\ldots\|_1$ are mapped via $G$ to points which are close in the $|\ldots|$ metric. Lesche's criterion is
thus nothing but the condition of stability of $G(x)$ under a measurement. In fact, the continuity
criterion ensures that a small error in a state variable $x$ will not bring in repeated experiments
violent fluctuations in measured data. The uniform continuity in Eq.(1) is then a key ingredient
to secure that the size of the changes in $G(x)$ depends only on the size of the changes in $x$ but not
on $x$ itself. This condition excludes, for example, systems whose statistical fluctuations in $G(x)$
would change too dramatically with a small change in the state variable $x$.

When $G(x)$ is bounded we can recast Lesche's condition of observability into an equivalent but
more expedient form, namely the (inverse) Lipschitz continuity condition [17]. In this case, a quantity
$G: X \subset \mathbb{R}^n \mapsto \mathbb{R}$ is observable in Lesche's sense if and only if for every $\varepsilon > 0$ there exists
$K_\varepsilon$ ($x$-independent and finite) such that for any pair $x, x' \in X$ one has

$$|G(x) - G(x')| \le K_\varepsilon \|x - x'\|_1 + \varepsilon . \tag{2}$$

We will practically employ the condition (2) in Section IV A.

Criteria (1) and (2) get generalized in the case when $n \to \infty$. This should be expected as the uniform
continuity may not survive in the large $n$ limit. To avoid such situations Lesche postulated that
the mapping

$$G: \bigcup_{n=1}^{\infty} X_n \mapsto \mathbb{R}, \tag{3}$$

with $X_n \subset \mathbb{R}^n$, taken as a function of $n$, converges to a uniformly continuous function in a
uniform manner, i.e., for every $\varepsilon > 0$ there exists $\delta_\varepsilon > 0$ such that for all $x, x' \in \mathbb{R}^n$ and all $n \in \mathbb{Z}^+$

$$\|x - x'\|_1 \le \delta_\varepsilon \;\Rightarrow\; \frac{|G(x) - G(x')|}{G_{\max}} < \varepsilon . \tag{4}$$

The uniform convergence is then reflected in the fact that $\delta_\varepsilon$ is both $x$ and $n$ independent.

Let us add a couple of remarks concerning the aforementioned observability conditions. Lesche's
condition, as illustrated above, is based on the notion of measurability. This is, however, not the
only possible way to define observability. It is well known that various alternative concepts
exist in the literature. For instance, one may use the approach based on distinguishability [18] or
detectability [19]. In fact, the condition based on measurability, and namely the condition of
uniform continuity, might often be too tight. Indeed, there are clearly many quantities which are
not uniformly continuous in their state variables (e.g., they are discontinuous at a finite number of
points in the state space) and which are, nevertheless, perfectly detectable and well defined away
from the singularity domain. Note, for instance, that although pressure and latent heat in 1st order
phase transitions are discontinuous in temperature, and similarly susceptibility in 2nd order phase
transitions is nonanalytic in temperature, there is still no reason to dismiss pressure, latent heat
and susceptibility as observables. Discontinuous or nonanalytic state functions are not exclusive to
phase transitions. Actually, such a type of behavior is common to many different situations
- formation of shocks in nonlinear wave propagation, mechanical systems involving small masses
and large damping, electric-circuit systems with large resistance and small inductance, catastrophe
and bifurcation theories, to name a few.

III. RENYI ENTROPIES 

Renyi entropies constitute a one-parametric family of information entropies labelled by Renyi's
parameter $\alpha \in \mathbb{R}^+$ and fulfill additivity with respect to the composition of statistically
independent systems. The special case with $\alpha = 1$ corresponds to ordinary Shannon's entropy. It
might be shown that Renyi entropies belong to the class of mixing homomorphic functions [12]
and that they are analytic for $\alpha$'s which lie in the I $\cup$ IV quadrants of the complex plane [20]. In
order to address the observability issue it is important to distinguish three situations.

A. Discrete probability distribution case 

Let $X = \{x_1, \ldots, x_n\}$ be a random variable admitting $n$ different events (be it outcomes of some
experiment or microstates of a given macrosystem), and let $\mathcal{P} = \{p_1, \ldots, p_n\}$ be the corresponding
probability distribution. Information theory then ensures that the most general information
measures (i.e., entropies) compatible with the additivity of independent events are those of Renyi [4]:

$$\mathcal{I}_\alpha(\mathcal{P}) = \frac{1}{1-\alpha} \log_2\left(\sum_{k=1}^{n} p_k^\alpha\right) . \tag{5}$$

Form (5) is valid even in the limiting case when $n \to \infty$. If, however, $n$ is finite then Renyi entropies
are bounded both from below and from above: $-\log_2 (p_k)_{\max} \le \mathcal{I}_\alpha \le \log_2 n$. In addition, Renyi
entropies are monotonically decreasing functions of $\alpha$, so namely $\mathcal{I}_{\alpha_1} < \mathcal{I}_{\alpha_2}$ if and only if $\alpha_1 > \alpha_2$.
One can reconstruct the entire underlying probability distribution knowing all Renyi entropies
via the Widder-Stieltjes inverse formula [20]. In the latter case the leading order contribution comes
from $\mathcal{I}_1(\mathcal{P})$, i.e., from Shannon's entropy. The typical playground of (5) is coding theory [21],
cryptography [14] and the theory of statistical inference [4]. The parameter $\alpha$ might then be
related to the price of constituent information. It should be admitted that in discrete cases
the conceptual connection of $\mathcal{I}_\alpha(\mathcal{P})$ with actual physical problems is still an open issue. The
interested reader can find some further practical applications of discrete Renyi entropies, for
instance, in [20,22].
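The definition (5) and the two properties quoted above (the bounds and the monotonic decrease in $\alpha$) can be checked numerically. The following is only a minimal sketch: the function name `renyi_entropy` and the sample distribution are illustrative choices, not taken from the paper.

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Renyi entropy in bits of a discrete distribution p, following Eq. (5)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                       # events with zero probability do not contribute
    if np.isclose(alpha, 1.0):         # alpha -> 1 limit: Shannon's entropy
        return float(-np.sum(p * np.log2(p)))
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))

# Illustrative distribution (an assumption made for this example only)
p = np.array([0.5, 0.25, 0.125, 0.125])
alphas = [0.5, 1.0, 2.0, 5.0]
values = [renyi_entropy(p, a) for a in alphas]

lower = -np.log2(p.max())              # -log2 (p_k)_max
upper = np.log2(len(p))                # log2 n
for a, v in zip(alphas, values):
    print(f"alpha = {a}: I_alpha = {v:.4f}  (bounds: {lower:.4f} .. {upper:.4f})")
```

For this distribution the printed values decrease as $\alpha$ grows and stay between the two bounds.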

B. Continuous probability distribution case 

Let $M$ be a support on which a continuous PDF $\mathcal{F}(x)$ is defined. We will assume that the
support (or outcome space) can be generally a fractal set. By covering the support with the mesh
$M^{(l)}$ of $d$-dimensional (disjoint) cubes $M_k^{(l)}$ ($k = 1, \ldots, n$) of size $l^d$ we may define the integrated
probability in the $k$-th cube as

$$p_{nk} = \mathcal{F}(x_i)\, l^{D}, \qquad x_i \in M_k^{(l)} . \tag{6}$$

The latter specifies the mesh probability distribution $\mathcal{P}_n = \{p_{n1}, \ldots, p_{nn}\}$. Infinite precision of
measurements (i.e., with $l \to 0$) often brings infinite information. In fact, it is more sensible to
consider the relative information entropy rather than the absolute one, as the most "junk" information
comes from the uniform distribution $\mathcal{E}_n$. It was shown in [4,20] that in the $n \to \infty$ (i.e., $l \to 0$)
limit it is possible to define a finite information measure compatible with information theory axioms.
This renormalized Renyi's entropy - negentropy (or information gain) - reads

$$\tilde{\mathcal{I}}_\alpha(\mathcal{F}) \equiv \lim_{l \to 0} \left(\mathcal{I}_\alpha(\mathcal{P}_n) - \mathcal{I}_\alpha(\mathcal{E}_n)\right) = \frac{1}{1-\alpha} \log_2\left(\frac{\int_M d\mu\, \mathcal{F}^\alpha(x)}{\int_M d\mu\, 1/V^\alpha}\right) . \tag{7}$$

Here $V$ is the corresponding volume. Eq.(7) can be viewed as a generalization of the Kullback-
Leibler relative entropy [24]. It is possible to introduce a simpler alternative prescription as

$$\mathcal{I}_\alpha(\mathcal{F}) \equiv \lim_{l \to 0} \left(\mathcal{I}_\alpha(\mathcal{P}_n) - \mathcal{I}_\alpha(\mathcal{E}_n)|_{V=1}\right) = \lim_{l \to 0} \left(\mathcal{I}_\alpha(\mathcal{P}_n) + D \log_2 l\right) = \frac{1}{1-\alpha} \log_2\left(\int_M d\mu\, \mathcal{F}^\alpha(x)\right) . \tag{8}$$



In both previous cases the measure $\mu$ is the Hausdorff measure [23]:



kth box 



if < D 

oo if > _D , 



with $D$ being the Hausdorff dimension of the support. Renyi entropies (7) and (8) are defined if and
only if the corresponding integral $\int_M d\mu\, \mathcal{F}^\alpha(x)$ exists. Eqs.(7) and (8) indicate that the asymptotic
expansion for $\mathcal{I}_\alpha(\mathcal{P}_n)$ has the form:

$$\mathcal{I}_\alpha(\mathcal{P}_n) = -D \log_2 l + \mathcal{I}_\alpha(\mathcal{F}) + o(1) = -D \log_2 l + \tilde{\mathcal{I}}_\alpha(\mathcal{F}) + \log_2 V_n + o(1) . \tag{9}$$

Here $V_n$ is the pre-fractal volume and the symbol $o(1)$ is the residual error which tends to 0 for
$l \to 0$. In contrast to the discrete case, Renyi entropies $\mathcal{I}_\alpha(\mathcal{F})$ are not necessarily positive here.

Information measures $\tilde{\mathcal{I}}_\alpha(\mathcal{F})$ and $\mathcal{I}_\alpha(\mathcal{F})$ have been so far mostly applied in the theory of statistical
inference [25] and in chaotic dynamical systems [10]. Let us note finally that one may view the
discrete distributions as a special case of the continuous PDF's, provided the outcome space (or
sample space) is discrete. In such a situation the Hausdorff dimension $D$ is zero and Eq.(8) reduces
directly to Eq.(5).
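As an illustration of the renormalization prescription $\mathcal{I}_\alpha(\mathcal{P}_n) + D \log_2 l$, one may take a simple PDF on the unit interval and watch the renormalized quantity converge. The sketch below assumes the example density $\mathcal{F}(x) = 2x$ on $[0,1]$ (so $D = 1$), which is our choice and does not appear in the paper; for it the limit $\frac{1}{1-\alpha}\log_2 \int_0^1 (2x)^\alpha dx = \frac{1}{1-\alpha}\log_2\frac{2^\alpha}{\alpha+1}$ is known in closed form.

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Renyi entropy in bits of a discrete distribution (alpha != 1)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))

def mesh_probabilities(n):
    """Integrated probabilities of the assumed PDF F(x) = 2x over n equal cells of [0, 1]."""
    edges = np.linspace(0.0, 1.0, n + 1)
    return edges[1:] ** 2 - edges[:-1] ** 2     # the CDF of F is x^2

def renormalized_entropy(n, alpha, D=1.0):
    """I_alpha(P_n) + D log2(l) with cell size l = 1/n."""
    l = 1.0 / n
    return renyi_entropy(mesh_probabilities(n), alpha) + D * np.log2(l)

def exact_entropy(alpha):
    """Closed-form limit for F(x) = 2x: log2(2^alpha / (alpha + 1)) / (1 - alpha)."""
    return float(np.log2(2.0 ** alpha / (alpha + 1.0)) / (1.0 - alpha))

for n in (16, 256, 4096):
    print(n, renormalized_entropy(n, 2.0), exact_entropy(2.0))
```

For $\alpha = 2$ the limit is $-\log_2(4/3)$, which is negative, so this example also shows that the renormalized entropy of a continuous PDF can be below zero.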



C. Multifractal systems 

Multifractals can be viewed as statistical systems where both cells in the covering mesh and
integrated probabilities scale as some power of $l$. Grouping all the integrated probabilities according
to their scaling exponents (Lipschitz-Holder exponents), say $a$, we effectively divide the support
into an ensemble of intertwined unifractals, each with its own fractal dimension $f(a)$. The exponents
$f(a)$ are called the singularity spectrum. In multifractal analysis it is customary to introduce yet
another pair of scaling exponents, namely the correlation exponent $\tau(\alpha)$, which prescribes the scaling
of the partition function, and the "inverse temperature" $\alpha$. These two descriptions are related via
the Legendre transformation

$$\tau(\alpha) = \min_{a} \left(\alpha a - f(a)\right) . \tag{10}$$

As in the case of continuous PDF's, the renormalization of Renyi entropies is required to extract the
relevant finite information - negentropy. It is possible to show that the following renormalized
Renyi's entropy complies with the axiomatics of information theory [20]:

$$\mathcal{I}_\alpha(\mu_{\mathcal{P}}) \equiv \lim_{l \to 0} \left(\mathcal{I}_\alpha(\mathcal{P}_n) - \mathcal{I}_\alpha(\mathcal{E}_n)|_{V=1}\right) = \lim_{l \to 0} \left(\mathcal{I}_\alpha(\mathcal{P}_n) + \frac{\tau(\alpha)}{\alpha - 1} \log_2 l\right) = \frac{1}{1-\alpha} \log_2\left(\int_a d\mu_{\mathcal{P}}^{(\alpha)}(a)\right) . \tag{11}$$
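Here the multifractal measure is defined as [23]

$$\mu_{\mathcal{P}}^{(\alpha)}(d; l) = \sum_{k\text{th box}} \frac{p_{nk}^\alpha}{l^d} \;\xrightarrow{\; l \to 0 \;}\; \begin{cases} 0 & \text{if } d < \tau(\alpha) \\ \infty & \text{if } d > \tau(\alpha) . \end{cases}$$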



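Renyi entropies $\mathcal{I}_\alpha(\mu_{\mathcal{P}})$ are defined if and only if the corresponding integrals $\int_a d\mu_{\mathcal{P}}^{(\alpha)}(a)$ exist.
Eq.(11) implies the following asymptotic expansion for $\mathcal{I}_\alpha(\mathcal{P}_n)$:

$$\mathcal{I}_\alpha(\mathcal{P}_n) = -D(\alpha) \log_2 l + \mathcal{I}_\alpha(\mu_{\mathcal{P}}) + o(1) . \tag{12}$$

Here

$$D(\alpha) = \frac{\tau(\alpha)}{\alpha - 1} = \lim_{l \to 0} \frac{\mathcal{I}_\alpha(\mathcal{P}_n)}{\log_2 (1/l)} \tag{13}$$

is the so-called generalized dimension [23]. Note also that for the systems of Subsection III B $D(\alpha)$
is $\alpha$ independent.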

Let us stress that Renyi's entropy of multifractal systems is a more convenient tool than the
ordinary Shannon's entropy. It is possible to show that one can obtain Shannon's entropy for any
unifractal by merely changing the Renyi parameter. In fact, Renyi's parameter coincides in this
case with the singularity spectrum [20].
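To make the $\alpha$-dependence of the generalized dimension concrete, the ratio $\mathcal{I}_\alpha(\mathcal{P}_n)/\log_2(1/l)$ can be evaluated on a textbook multifractal. The sketch below uses the binomial multiplicative cascade on the unit interval, which is an assumption of this example and is not discussed in the paper; for it $\sum_k p_{nk}^\alpha = (p^\alpha + (1-p)^\alpha)^m$ after $m$ steps, so that $D(\alpha) = \frac{1}{1-\alpha}\log_2(p^\alpha + (1-p)^\alpha)$ exactly.

```python
import numpy as np

def binomial_measure(p, m):
    """Cell probabilities of a binomial multiplicative cascade after m halvings of [0, 1]."""
    probs = np.array([1.0])
    for _ in range(m):
        # each cell splits its mass into fractions p and 1 - p
        probs = np.concatenate([p * probs, (1.0 - p) * probs])
    return probs

def generalized_dimension(probs, l, alpha):
    """Finite-resolution estimate I_alpha(P_n) / log2(1/l) of D(alpha), alpha != 1."""
    renyi = np.log2(np.sum(probs ** alpha)) / (1.0 - alpha)
    return float(renyi / np.log2(1.0 / l))

m = 12
probs = binomial_measure(0.3, m)
for alpha in (0.5, 2.0, 5.0):
    exact = np.log2(0.3 ** alpha + 0.7 ** alpha) / (1.0 - alpha)
    print(alpha, generalized_dimension(probs, 2.0 ** -m, alpha), exact)
```

With $p = 1/2$ the cascade is uniform and the same code returns $D(\alpha) = 1$ for every $\alpha$, which is the $\alpha$-independent behavior of an ordinary continuous PDF.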



IV. OBSERVABILITY OF RENYI ENTROPIES: DISCRETE PROBABILITY DISTRIBUTION

A. Finite case 



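It is quite simple to see that for systems with a finite number of outcomes (e.g., systems with a
finite number of microstates) Lesche's criterion of observability is fulfilled. The proof goes as
follows². We first use the inequality $\ln x \le x - 1$ and assume that $\sum_k p_k^\alpha \ge \sum_k q_k^\alpha$; then

$$|\mathcal{I}_\alpha(\mathcal{P}) - \mathcal{I}_\alpha(\mathcal{Q})| = \frac{1}{|1-\alpha|} \ln\left(\frac{\sum_k p_k^\alpha}{\sum_k q_k^\alpha}\right) \le \frac{1}{|1-\alpha|} \frac{1}{\sum_k q_k^\alpha} \left|\sum_k \left(p_k^\alpha - q_k^\alpha\right)\right| .$$

This might be written in the invariant form as

$$|\mathcal{I}_\alpha(\mathcal{P}) - \mathcal{I}_\alpha(\mathcal{Q})| \le \frac{1}{|1-\alpha|\, c(\alpha, \mathcal{P}, \mathcal{Q})} \left|\sum_{i=1}^{n} \left(p_i^\alpha - q_i^\alpha\right)\right| \le \frac{1}{|1-\alpha|\, d(\alpha, n)} \left|\sum_{i=1}^{n} \left(p_i^\alpha - q_i^\alpha\right)\right| .$$

Here $c(\alpha, \mathcal{P}, \mathcal{Q}) = \min\left(\sum_i p_i^\alpha, \sum_i q_i^\alpha\right)$ and

$$d(\alpha, n) = \begin{cases} 1 & \text{if } 0 < \alpha < 1 \\ n^{1-\alpha} & \text{if } \alpha > 1 . \end{cases} \tag{14}$$

²For simplicity's sake we use in this subsection a natural logarithm instead of $\log_2$.

To find an efficient estimate for $\left|\sum_k (p_k^\alpha - q_k^\alpha)\right|$ in terms of $\|\mathcal{P} - \mathcal{Q}\|_1$ we utilize the following trick:
Let us define the function

$$A(s, \mathcal{P}) = \sum_{k=1}^{n} \left(p_k - f(s)\right) \theta\left(p_k - f(s)\right) . \tag{15}$$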



Here $\theta(\ldots)$ is the Heaviside step function and $f: [a, b] \mapsto [0, 1]$ is some invertible function. Both
$f(s)$, $a$ and $b$ will be chosen at a later stage so as to facilitate our computations. Note also that

$$\max\{0;\, (1 - n f(s))\} \le A(s, \mathcal{P}) \le 1 . \tag{16}$$

An important property of $A(s, \mathcal{P})$ is the following straightforward inequality

$$|A(s, \mathcal{P}) - A(s, \mathcal{Q})| \le \sum_{k=1}^{n} \left|(p_k - f(s))\theta(p_k - f(s)) - (q_k - f(s))\theta(q_k - f(s))\right| \le \sum_{k=1}^{n} |p_k - q_k| = \|\mathcal{P} - \mathcal{Q}\|_1 , \tag{17}$$

which is valid for any $s \in [a, b]$. Note further that

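$$\begin{aligned} \int_a^b A(s, \mathcal{P})\, ds &= \sum_{k=1}^{n} \int_{f(a)}^{f(b)} (p_k - x)\, \theta(p_k - x) \left(f^{-1}(x)\right)' dx \\ &= \sum_{k=1}^{n} \left[ \theta(p_k - f(a)) \left( (f(a) - p_k)\, a + \int_{f(a)}^{p_k} f^{-1}(x)\, dx \right) + \theta(p_k - f(b)) \left( (p_k - f(b))\, b + \int_{p_k}^{f(b)} f^{-1}(x)\, dx \right) \right] . \end{aligned} \tag{18}$$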



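Here we have used the fact that the $p_k$'s must lie somewhere between $f(a)$ and $f(b)$. If we now choose
$f(x) = (x/\alpha)^{1/(\alpha-1)}$ with

$$a = \begin{cases} \infty & \text{if } 0 < \alpha < 1 \\ 0 & \text{if } \alpha > 1 \end{cases} \qquad \text{and} \qquad b = \alpha$$

(so $f(a) = 0$, $f(b) = 1$) we obtain

$$\sum_{k=1}^{n} \left(p_k^\alpha - q_k^\alpha\right) = \int_a^b \left(A(s, \mathcal{P}) - A(s, \mathcal{Q})\right) ds . \tag{19}$$

Applying (17) and (19) we may write for $\alpha > 1$

$$\left|\sum_{k=1}^{n} \left(p_k^\alpha - q_k^\alpha\right)\right| \le \left| \int_0^c n (s/\alpha)^{1/(\alpha-1)}\, ds + \int_c^\alpha |A(s, \mathcal{P}) - A(s, \mathcal{Q})|\, ds \right| \le n(\alpha - 1)(c/\alpha)^{\alpha/(\alpha-1)} + (\alpha - c) \|\mathcal{P} - \mathcal{Q}\|_1 . \tag{20}$$

So if we take $c = \alpha (\varepsilon/n^\alpha)^{(\alpha-1)/\alpha}$ (this assures that $A(s, \mathcal{P}) \ge (1 - n f(s)) \ge 0$ for $s \in (0, c]$) then

$$|\mathcal{I}_\alpha(\mathcal{P}) - \mathcal{I}_\alpha(\mathcal{Q})| \le K_\varepsilon^{(1)} \|\mathcal{P} - \mathcal{Q}\|_1 + \varepsilon , \tag{21}$$

with $K_\varepsilon^{(1)} = \frac{\alpha}{\alpha - 1} \left( (n^\alpha/\varepsilon)^{(\alpha-1)/\alpha} - 1 \right) \varepsilon^{(\alpha-1)/\alpha}$.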

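In the case when $0 < \alpha < 1$ we may utilize (16), (17) and (19) to obtain

$$\left|\sum_{k=1}^{n} \left(p_k^\alpha - q_k^\alpha\right)\right| \le \int_\alpha^c |A(s, \mathcal{P}) - A(s, \mathcal{Q})|\, ds + \int_c^\infty n (s/\alpha)^{1/(\alpha-1)}\, ds \le (c - \alpha) \|\mathcal{P} - \mathcal{Q}\|_1 + n(1 - \alpha)(c/\alpha)^{\alpha/(\alpha-1)} . \tag{22}$$

By setting $c = \alpha (\varepsilon/n)^{(\alpha-1)/\alpha}$ (this assures that $A(s, \mathcal{P}) \ge (1 - n f(s)) \ge 0$ for $s \in [c, \infty)$) we have

$$|\mathcal{I}_\alpha(\mathcal{P}) - \mathcal{I}_\alpha(\mathcal{Q})| \le K_\varepsilon^{(2)} \|\mathcal{P} - \mathcal{Q}\|_1 + \varepsilon , \tag{23}$$

with $K_\varepsilon^{(2)} = \frac{\alpha}{1 - \alpha} \left( (\varepsilon/n)^{(\alpha-1)/\alpha} - 1 \right)$. Note particularly that $\lim_{\alpha \to 1_+} K_\varepsilon^{(1)} = \ln(n/\varepsilon)$ and
$\lim_{\alpha \to 1_-} K_\varepsilon^{(2)} = \ln(n/\varepsilon)$. This indicates that the Lipschitz conditions (21) and (23) can be
analytically continued to $\alpha = 1$. This reconfirms the well known result that Shannon's entropy is
Lipschitz.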

Finally note that Eqs.(21) and (23) represent the Lesche criterion (2). Hence, in cases when the 
state space corresponds to the space of all possible probability distributions assigned to a definite 
(finite) number of outcomes (microstates) Renyi entropies are measurable in Lesche's sense. 
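The Lipschitz-type bounds (21) and (23) lend themselves to a quick numerical sanity check. The sketch below is only illustrative: it assumes the constants $K_\varepsilon^{(1)} = \frac{\alpha}{\alpha-1}\left(n^{\alpha-1} - \varepsilon^{(\alpha-1)/\alpha}\right)$ for $\alpha > 1$ and $K_\varepsilon^{(2)} = \frac{\alpha}{1-\alpha}\left((\varepsilon/n)^{(\alpha-1)/\alpha} - 1\right)$ for $0 < \alpha < 1$ (our reading of the expressions accompanying (21) and (23)), uses natural logarithms as in this subsection, and draws random test distributions from a flat Dirichlet law, which is our own choice.

```python
import numpy as np

def renyi_nat(p, alpha):
    """Renyi entropy with natural logarithm (alpha != 1)."""
    return float(np.log(np.sum(p ** alpha)) / (1.0 - alpha))

def lipschitz_constant(alpha, n, eps):
    """Assumed form of K_eps in (21) for alpha > 1 and in (23) for 0 < alpha < 1."""
    if alpha > 1.0:
        return alpha / (alpha - 1.0) * (n ** (alpha - 1.0) - eps ** ((alpha - 1.0) / alpha))
    return alpha / (1.0 - alpha) * ((eps / n) ** ((alpha - 1.0) / alpha) - 1.0)

def worst_slack(alpha, n, eps, trials=200, seed=0):
    """Smallest value of K_eps * ||P - Q||_1 + eps - |I(P) - I(Q)| over random pairs."""
    rng = np.random.default_rng(seed)
    k_eps = lipschitz_constant(alpha, n, eps)
    worst = np.inf
    for _ in range(trials):
        p = rng.dirichlet(np.ones(n))          # hypothetical test distributions
        q = rng.dirichlet(np.ones(n))
        lhs = abs(renyi_nat(p, alpha) - renyi_nat(q, alpha))
        rhs = k_eps * np.sum(np.abs(p - q)) + eps
        worst = min(worst, rhs - lhs)
    return worst

print(worst_slack(2.0, 8, 0.1), worst_slack(0.5, 8, 0.1))
```

A non-negative slack for every sampled pair is what the bounds predict. As $\alpha \to 1$ both constants approach $\ln(n/\varepsilon)$, in line with the limits quoted above.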

B. Infinite limit case 

As already mentioned in Section II, the situation becomes more delicate in the large
$n$ limit. This is because for the sake of uniform metric continuity at any $n$ one might require that
the limiting case should also obey the uniform continuity. To tackle statistical systems with a
countable infinity of microstates³ we will illustrate first that, by introducing a coarse graining into
a realistic measurement, Lesche's alleged counterexamples do not apply.

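In his paper [12] Lesche proposed the following examples to demonstrate the non-observability
of Renyi entropies. For $\alpha > 1$ he picked two distributions, namely ($i = 1, \ldots, n$)

$$\mathcal{P} = \left\{ p_i = \frac{1 - \delta_{i1}}{n - 1} \right\} , \qquad \mathcal{P}' = \left\{ p'_i = \left(1 - \frac{\delta}{2}\right) p_i + \frac{\delta}{2}\, \delta_{i1} \right\} , \qquad \|\mathcal{P} - \mathcal{P}'\|_1 = \delta . \tag{24}$$

Lesche then went on to show that these two distributions do not fulfill the uniform continuity in
the large $n$ limiting case. Let us now show that the coarse graining (which is naturally present in
any realistic measurement) will restore the uniform continuity for the large $n$ limit case.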

We will assume, for simplicity's sake, that the discrete probability distributions (24) live
on the unit lattice with equidistantly distributed lattice (i.e., support) points. In the spirit
of Lesche's paper we assume that the true probability distribution on the interval $[0, 1]$ is obtained
in the $n \to \infty$ limit (i.e., when the lattice spacing tends to zero). As usual, we will keep $n \gg 1$
finite during calculations and send it to infinity only at the very last stage. Because every actual
measurement has a certain resolution capacity we will further assume that a realistic measurement
can sample the unit interval through a window of width $1/k$ ($k \ll n$) (so $k$ windows will cover the
support space). In this case one can know only integrated probabilities, hence $\mathcal{P} \to \mathcal{P}_{(k)}$ and
$\mathcal{P}' \to \mathcal{P}'_{(k)}$. As in every window there are $n/k$ underlying $p_i$'s we have ($i = 1, \ldots, k$)




³Such systems often appear in various physical situations, (countable) Markov chains, Fermi-Pasta-Ulam
lattice models or symbolic dynamical models being examples.

Using the fact that

Xa(nfc))-^a(^(fc)) 



log2 k we have 



log2 



(1 - a) 
X (logsfc)-^ 



(i+^(f-i)r+(fc-i)(^r(fr. 



1 — a 1 — a 



log2 



/ l0g2 fc 

/ log2 k 



a (k-l) 
2 J 2 Ink 



(26) 



It is now simple to see that Lesche's condition is easily fulfilled, as for arbitrarily small $\varepsilon$ there
exists $\delta_\varepsilon$, namely



(27) 



for which the metric proximity $\|\mathcal{P}_{(k)} - \mathcal{P}'_{(k)}\|_1 \le \delta_\varepsilon$ implies the proximity of outcomes, i.e.,
$|\mathcal{I}_\alpha(\mathcal{P}_{(k)}) - \mathcal{I}_\alpha(\mathcal{P}'_{(k)})| / \log_2 k < \varepsilon$. This result is clearly independent of $n$ because whenever $n$
is finite the outcome of the previous section applies and for $n \to \infty$ the validity has just been
proven.
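The effect of the coarse graining can be seen in a small numerical experiment. The sketch below is hedged in two ways: it assumes that Lesche's $\alpha > 1$ pair has the standard form $p_i = (1 - \delta_{i1})/(n-1)$, $p'_i = (1 - \delta/2)\, p_i + (\delta/2)\, \delta_{i1}$, and it models the measurement window simply as a sum over $n/k$ neighbouring lattice points (the helper names are ours).

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Renyi entropy in bits (alpha != 1)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))

def lesche_pair(n, delta):
    """Assumed form of Lesche's alpha > 1 pair; the two distributions are delta apart in l1."""
    p = np.full(n, 1.0 / (n - 1))
    p[0] = 0.0
    q = (1.0 - delta / 2.0) * p
    q[0] = delta / 2.0
    return p, q

def coarse_grain(p, k):
    """Integrated probabilities seen through k windows (n must be divisible by k)."""
    return p.reshape(k, -1).sum(axis=1)

def relative_gap(p, q, alpha):
    """|I_alpha(P) - I_alpha(Q)| divided by the maximal entropy log2(number of cells)."""
    return abs(renyi_entropy(p, alpha) - renyi_entropy(q, alpha)) / np.log2(len(p))

alpha, delta, k = 2.0, 0.01, 10
for n in (10**4, 10**6):
    p, q = lesche_pair(n, delta)
    fine = relative_gap(p, q, alpha)
    coarse = relative_gap(coarse_grain(p, k), coarse_grain(q, k), alpha)
    print(n, fine, coarse)
```

At fixed $\delta$ the fine-grained relative gap grows with $n$, which is the instability, whereas the coarse-grained gap stays small and practically $n$-independent.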

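We proceed analogously for $\alpha < 1$. In this case Lesche's counterexamples were provided by two
distributions ($i = 1, \ldots, n$)

$$\mathcal{P} = \left\{ p_i = \delta_{i1} \right\} , \qquad \mathcal{P}' = \left\{ p'_i = \left(1 - \frac{\delta}{2}\right) p_i + \frac{\delta}{2}\, \frac{1 - \delta_{i1}}{n - 1} \right\} , \qquad \|\mathcal{P} - \mathcal{P}'\|_1 = \delta . \tag{28}$$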

As before we can obtain integrated probability distributions which read (i = 1, . . . , fc) 

^;,.{,f).(:-0,,,_l_fg _,.)}, 

and so 



(28) 



(29) 



X (logsfc) ^ 



l0g2 



5 n{k-l) \ f5 n 

2k{^) ^> 



(1-a) 



6 k-iy ^JS 



2 fc(n- 1) 



/ log2 k 



2k 2 Ink ^ ' 



(30) 



Here the inequality

$$x^\alpha - \alpha x \ge 0, \qquad \text{for } x \in [0, 1],\; \alpha \in [0, 1] ,$$

was used on the last line. Consequently we again see that for sufficiently small $\varepsilon$ there exists $\delta_\varepsilon$,
namely

2k I 

Se < ^^V^ Hk)'/", (31) 

which satisfies Lesche's condition. Note that from (27) and (31) it follows that our argument
naturally includes also the case $\alpha = 1$ (i.e., Shannon's entropy), as in all steps leading to (27) and
(31) we have well defined limits $\alpha \to 1_+$ and $\alpha \to 1_-$, respectively.



C. Region of instability 

In the previous section we have found that Lesche's counterexamples can be bypassed by
introducing a coarsened resolution into a measurement process. Let us now show that even when
the coarsening is not employed the Lesche instability points have zero measure in the space of
all discrete infinite distributions - Bhattacharyya's measure [26] - and hence they do not affect a
measurement in most practical situations.

The key observation is that Lesche's counterexamples single out a very narrow class of probability
distributions. In particular, they imply that when $\alpha > 1$, only distributions with high peak
probabilities create problems. Similarly, in cases where $\alpha < 1$ only distributions with an infinite
number of microstates having a negligible overall probability exhibit a critical type of behavior.
We now demonstrate that the above probability distributions have a very small relevance in actual
measurement. For this purpose we recall the concept of the Bhattacharyya measure [26].

Suppose that $X$ is a discrete random variable with $n$ different values, $\mathbb{P}_n$ is the probability space
affiliated with $X$ and $\mathcal{P} = \{p_1, \ldots, p_n\}$ is a sample probability distribution from $\mathbb{P}_n$. Because $\mathcal{P}$
is non-negative and summable to unity, it follows that the square-root likelihood $\xi_i = \sqrt{p_i}$ exists
for all $i = 1, \ldots, n$, and it satisfies the normalization condition

$$\sum_{i=1}^{n} (\xi_i)^2 = 1 . \tag{32}$$

We see that $\xi$ can be regarded as a unit vector in the Hilbert space $\mathcal{H} = \mathbb{R}^n$. Now, let $\mathcal{P}^{(1)}$ and
$\mathcal{P}^{(2)}$ denote a pair of probability distributions and $\xi^{(1)}$ and $\xi^{(2)}$ the corresponding elements in
Hilbert space. Then the inner product

$$\cos\phi = \sum_{i=1}^{n} \xi_i^{(1)} \xi_i^{(2)} = 1 - \frac{1}{2} \sum_{i=1}^{n} \left( \xi_i^{(1)} - \xi_i^{(2)} \right)^2 \tag{33}$$

defines the angle $\phi$ that can be interpreted as a distance between two probability distributions.
More precisely, if $S^{n-1}$ is the unit sphere in the $n$-dimensional Hilbert space, then $\phi$ is the
spherical (or geodesic) distance between the points on $S^{n-1}$ determined by $\xi^{(1)}$ and $\xi^{(2)}$. Clearly,
the maximal possible distance, corresponding to orthogonal distributions, is given by $\phi = \pi/2$.
This follows from the fact that $\xi^{(1)}$ and $\xi^{(2)}$ are non-negative, and hence they are located only on
the positive orthant of $S^{n-1}$. Spherical geometry on $S^{n-1}$ then naturally induces the measure -
the Bhattacharyya measure. The corresponding geodesic distance $\phi$ is the so-called Bhattacharyya
distance. We remark that the surface "area" of the orthant $(S^{n-1})^+$, i.e., the volume of the
probability space $\mathbb{P}_n$, is

$$V_{n-1}\left((S^{n-1})^+\right) = \frac{1}{2^n} \int_{S^{n-1}} d\Omega_{n-1} = \frac{\pi^{n/2}}{2^{n-1}\, \Gamma(n/2)} . \tag{34}$$

The Bhattacharyya measure of any set $A \subset (S^{n-1})^+$ is then

$$\mu_B(A) = \frac{1}{V_{n-1}\left((S^{n-1})^+\right)} \int_A d\Omega_{n-1} , \tag{35}$$

and so particularly the normalization $\mu_B(\mathbb{P}_n) = 1$ holds. The reader may see that the Bhattacharyya
measure is indeed a very natural concept. In fact, (35) implies that the latter is just
the Haar measure on $S^{n-1}$. One could possibly adopt some other (not spherical) metric on
the probability space $(S^{n-1})^+$, but because all non-singular metric measures on compact
manifolds are equivalent (i.e., they differ only by finite multiplicative functions - Jacobians) the
Bhattacharyya measure will be fully satisfactory for our purpose. Actually the exclusiveness of the
Bhattacharyya measure in non-parametric statistics was already emphasized, for instance, in [27].
The naturalness and simplicity of Bhattacharyya's measure has also been appreciated in various areas
of physics and engineering ranging from quantum mechanics [28] to statistical pattern recognition
and signal processing [29].
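The square-root map and the resulting angle are straightforward to compute. The following is a minimal sketch with illustrative helper names of our own; it evaluates the Bhattacharyya distance $\phi$ as the angle between the vectors $\xi_i = \sqrt{p_i}$ and compares it with the chordal (Euclidean) distance, which for unit vectors equals $2\sin(\phi/2)$.

```python
import numpy as np

def bhattacharyya_angle(p, q):
    """Geodesic (Bhattacharyya) distance between two distributions on the unit sphere."""
    xi1, xi2 = np.sqrt(p), np.sqrt(q)
    cos_phi = np.clip(np.dot(xi1, xi2), 0.0, 1.0)   # guard against rounding outside [0, 1]
    return float(np.arccos(cos_phi))

def chordal_distance(p, q):
    """Euclidean distance between the square-root vectors."""
    return float(np.linalg.norm(np.sqrt(p) - np.sqrt(q)))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.2, 0.6])
phi = bhattacharyya_angle(p, q)
print(phi, chordal_distance(p, q), 2.0 * np.sin(phi / 2.0))

# Distributions with disjoint supports are orthogonal, i.e., maximally distant
print(bhattacharyya_angle(np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])), np.pi / 2)
```

Because every $\xi_i$ is non-negative the angle never exceeds $\pi/2$, so all distributions sit on the positive orthant of the sphere.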



1. a > 1 case 



Let us now look at the Bhattacharyya measure of the family of Lesche's critical distributions
corresponding to $\alpha > 1$. In this case the relation (24) suggests that the critical distributions
form a 1-parametric family of distributions parametrized by $\delta$. Fig.1 indicates that there are
clearly $n$ such families. In contrast to the orthant surface which has dimension $D = n - 1$, the
countable set of line-like 1-parametric families has the topological dimension $D = 1$ and hence
the Bhattacharyya measure of Lesche's critical distributions is plainly zero.




FIG. 1. The family of Lesche's critical distributions ($\alpha > 1$). A statistical system can be represented by points
$\xi$ on the positive orthant $S_+$ of the unit sphere $S$ in a real Hilbert space $\mathcal{H}$. The 1-parametric families of Lesche's critical
distributions are then represented by arcs $\gamma_i(\delta) = \left\{ \xi_k(\delta) = \sqrt{ \frac{\delta}{2}\, \delta_{ik} + \left(1 - \frac{\delta}{2}\right) \frac{1 - \delta_{ik}}{n - 1} }\, ;\; i, k = 1, \ldots, n;\; \delta \in [0, 2] \right\}$.
The depicted example corresponds to $S = S^2$.



We wish to ask whether some extension of (24) might have a non-zero measure. We will
illustrate now that the answer is negative. In fact, we will show that with Bhattacharyya measure
approaching 1 (in the limit of large $n$) all distributions $\mathcal{P} \in \mathbb{P}_n$ inevitably fulfil Lesche's condition
(4). Inasmuch, all distributions which exhibit the critical behavior encountered in Ref. [12] have
$\mu_B \to 0$ as $n \to \infty$. To prove this we employ the following isoperimetric inequality (also known as
Levy's lemma) [30]. Let $f: S^{n-1} \mapsto \mathbb{R}$ be a $K$-Lipschitz function, i.e., for any pair $\xi^{(1)}, \xi^{(2)} \in S^{n-1}$⁴

$$|f(\xi^{(1)}) - f(\xi^{(2)})| \le K\, \|\xi^{(1)} - \xi^{(2)}\|_2 . \tag{36}$$

Then

$$\mu\left\{ \xi \in S^{n-1} :\; |f(\xi) - E(f)| > \varepsilon \right\} \le 4 \exp\left( -\vartheta\, \varepsilon^2 n / K^2 \right) , \tag{37}$$

where $E(f) = \int_{S^{n-1}} f(\xi)\, d\mu$, $\mu$ is the Haar measure on $S^{n-1}$ and $\vartheta$ is an absolute (i.e., $n$-independent) constant whose
precise form is not important here.

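Let us choose $f(\xi) = \|\xi\|_{2\alpha}$. Using the triangle inequality we have

$$\left| \|\xi^{(1)}\|_{2\alpha} - \|\xi^{(2)}\|_{2\alpha} \right| \le \|\xi^{(1)} - \xi^{(2)}\|_{2\alpha} \le \|\xi^{(1)} - \xi^{(2)}\|_2 , \tag{38}$$

so $\|\xi\|_{2\alpha}$ is a 1-Lipschitz function. In addition,

$$\|\mathcal{P} - \mathcal{Q}\|_1 = \sum_i \left| (\xi_i^{(1)})^2 - (\xi_i^{(2)})^2 \right| = \sum_i \left| \xi_i^{(1)} - \xi_i^{(2)} \right| \left( \xi_i^{(1)} + \xi_i^{(2)} \right) \ge \|\xi^{(1)} - \xi^{(2)}\|_2^2 . \tag{39}$$

So particularly when two distributions are $\delta$ close then their representative points on the sphere
fulfil the inequality

$$\left| \|\xi^{(1)}\|_{2\alpha} - \|\xi^{(2)}\|_{2\alpha} \right| \le \sqrt{\delta} . \tag{40}$$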



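The next step is to calculate the mean $\int_{S^{n-1}} f(\xi)\, d\mu$. As it stands, this is quite a difficult task but
fortunately we may take advantage of the fact that

$$\int_{S^{n-1}} \|\xi\|_{2\alpha}^{2\alpha}\, d\mu = n \int_{S^{n-1}} |\xi_1|^{2\alpha}\, d\mu(\xi) = n\, \frac{\int_0^\pi |\cos\theta|^{2\alpha} (\sin\theta)^{n-2}\, d\theta}{\int_0^\pi (\sin\theta)^{n-2}\, d\theta} = \frac{n\, \Gamma(n/2)\, \Gamma(\alpha + 1/2)}{\sqrt{\pi}\, \Gamma(n/2 + \alpha)} \approx \frac{\Gamma(\alpha + 1/2)\, 2^\alpha}{\sqrt{\pi}\, n^{\alpha - 1}} . \tag{41}$$

(Note that (41) is true for all $\alpha > 0$.) Using Jensen's inequality we then have

$$E\left(\|\xi\|_{2\alpha}\right) = \int_{S^{n-1}} \|\xi\|_{2\alpha}\, d\mu \le \left( \int_{S^{n-1}} \|\xi\|_{2\alpha}^{2\alpha}\, d\mu \right)^{\frac{1}{2\alpha}} \approx \left( \frac{\Gamma(\alpha + 1/2)\, 2^\alpha}{\sqrt{\pi}} \right)^{\frac{1}{2\alpha}} n^{\frac{1}{2\alpha} - \frac{1}{2}} . \tag{42}$$

On the other hand, because all distributions from $\mathbb{P}_n$ fulfill the condition

$$\sum_{i=1}^{n} (\xi_i)^{2\alpha} \ge n^{1 - \alpha} , \tag{43}$$

we have that $E\left(\|\xi\|_{2\alpha}\right) \ge n^{\frac{1}{2\alpha} - \frac{1}{2}}$. Thus the mean value of $\|\xi\|_{2\alpha}$ goes to zero as $b\, n^{\frac{1}{2\alpha} - \frac{1}{2}}$, where
$b = b(n, \alpha)$ is some bounded function of $n$. Collecting results (41) and (42) together we can recast
Levy's lemma into the form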

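$$\begin{aligned} & \mu_B\left( \left| \|\xi\|_{2\alpha} - E(\|\xi\|_{2\alpha}) \right| \le \varepsilon \right) \ge 1 - 4 e^{-\vartheta \varepsilon^2 n} \\ \Rightarrow\;\; & \mu_B\left( \left| \|\xi\|_{2\alpha} - E(\|\xi\|_{2\alpha}) \right| \le \varepsilon \left[ E(\|\xi\|_{2\alpha}) \right]^p \right) \ge 1 - 4 \exp\left( -\vartheta \varepsilon^2 n \left[ E(\|\xi\|_{2\alpha}) \right]^{2p} \right) , \end{aligned} \tag{44}$$

for some $\varepsilon > 0$. Note that due to the symmetry of $f(\xi)$ we were allowed to exchange in (37) the
averaging over the surface of $S^{n-1}$ for the averaging over the positive orthant $(S^{n-1})^+$. Result (44)
implies that for any $\varepsilon > 0$ and any $1 < p < \alpha/(\alpha - 1)$ the inequalities

$$\begin{aligned} \|\xi\|_{2\alpha} &\ge E(\|\xi\|_{2\alpha}) \left( 1 - \varepsilon \left[ E(\|\xi\|_{2\alpha}) \right]^{p-1} \right) \ge E(\|\xi\|_{2\alpha})\, e^{-2\varepsilon \left[ E(\|\xi\|_{2\alpha}) \right]^{p-1}} \\ \|\xi\|_{2\alpha} &\le E(\|\xi\|_{2\alpha}) \left( 1 + \varepsilon \left[ E(\|\xi\|_{2\alpha}) \right]^{p-1} \right) \le E(\|\xi\|_{2\alpha})\, e^{\varepsilon \left[ E(\|\xi\|_{2\alpha}) \right]^{p-1}} \end{aligned} \tag{45}$$

hold for almost all $\xi \in \mathbb{P}_n$ (their Bhattacharyya measure is arbitrarily close to 1 as $n$ increases).
The fact that "well behaved" functions are at large $n$ practically constant on almost the entire sphere
is known as the concentration of measure phenomenon [30-32]. In passing, the reader may notice
that the relation (44) is a variant of Bernstein-Hoeffding's large deviation inequality [31,33].

⁴The metric $\|\ldots\|_2$ appearing in the lemma represents the Euclidean distance inherited from $\mathbb{R}^n$ (this
is also called the chordal metric). Note that $\|\xi^{(1)} - \xi^{(2)}\|_2 = 2\sin(\phi/2) \le \phi$, with $\phi$ representing the
Bhattacharyya distance.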
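The concentration of measure invoked here can be observed in a small Monte Carlo experiment. The sketch below is only illustrative: it samples points uniformly from the positive orthant by normalizing Gaussian vectors and taking absolute values (a standard construction, not taken from the paper), compares the sampled mean of $\sum_i \xi_i^{2\alpha}$ with the closed form $n\Gamma(n/2)\Gamma(\alpha+1/2)/(\sqrt{\pi}\,\Gamma(n/2+\alpha))$, and reports the relative spread of $\|\xi\|_{2\alpha}$.

```python
import math
import numpy as np

def sample_orthant(n, size, rng):
    """Draw points uniformly from the positive orthant of the unit sphere S^(n-1)."""
    g = rng.standard_normal((size, n))
    return np.abs(g) / np.linalg.norm(g, axis=1, keepdims=True)

def exact_power_sum_mean(n, alpha):
    """Closed form n Gamma(n/2) Gamma(alpha+1/2) / (sqrt(pi) Gamma(n/2+alpha))."""
    log_ratio = math.lgamma(n / 2) + math.lgamma(alpha + 0.5) - math.lgamma(n / 2 + alpha)
    return n * math.exp(log_ratio) / math.sqrt(math.pi)

def concentration_summary(n, alpha, size=1000, seed=0):
    """Sample mean of sum_i xi_i^(2 alpha) and relative spread (std/mean) of the 2alpha-norm."""
    rng = np.random.default_rng(seed)
    xi = sample_orthant(n, size, rng)
    power_sum = np.sum(xi ** (2.0 * alpha), axis=1)
    norm = power_sum ** (1.0 / (2.0 * alpha))
    return float(power_sum.mean()), float(norm.std() / norm.mean())

for n in (20, 200, 2000):
    mean, spread = concentration_summary(n, 2.0)
    print(n, mean, exact_power_sum_mean(n, 2.0), spread)
```

The relative spread shrinks as $n$ grows, i.e., $\|\xi\|_{2\alpha}$ becomes practically constant over almost the whole orthant, which is the behavior the argument relies on.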



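Using now Minkowski's triangle inequality

$$
\left|\, \|\xi^{(1)}\|_{2\alpha} - \|\xi^{(2)}\|_{2\alpha} \right| \;\leq\; \left|\, \|\xi^{(1)}\|_{2\alpha} - E(\|\xi\|_{2\alpha}) \right| + \left|\, \|\xi^{(2)}\|_{2\alpha} - E(\|\xi\|_{2\alpha}) \right| \;\leq\; 2\varepsilon\, [E(\|\xi\|_{2\alpha})]^{p}
$$

and bearing in mind (40) we can choose $\sqrt{\delta} \geq 2\varepsilon\, [E(\|\xi\|_{2\alpha})]^{p}$. Consequently (for $n > 3$)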



\MT^)-MQ)\ 




(a - l)log2n 



6ae 



< 



2a 



(a-1) 



Ig 



[a — 1) [a — 1) \4^ 

Thus we see that one can always find an appropriate $\delta_\varepsilon$ for every $\varepsilon$, namely

and so the observability condition (4) is satisfied in all cases for which inequalities (45) hold.

2. $0 < \alpha < 1$ case



6a 



(p-l)/2p 



(46) 



Similar analysis can be performed for critical distributions in the a < 1 case. The corre- 
sponding 1-parametric families of Lesche's critical distributions are represented by arcs q(5) = 

^fc((5) = ^(1 - f ) Sik + I ; A: G n; 5 e [0,2]|. These arcs are identical with arcs 7j((5) 

depicted in Fig. 1, only the orientation is reversed. Consequently the Bhattacharyya measure is again zero in this case.
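To see concretely what a critical pair looks like for $\alpha < 1$, one can evaluate the entropies in closed form. The sketch below is illustrative only. It assumes a Lesche-type pair consisting of a point mass $\mathcal{P}$ and the mixture $\mathcal{Q} = (1 - \delta/2)\,\mathcal{P} + (\delta/2)\cdot\text{uniform}$, and it assumes that condition (4) normalises the entropy difference by the maximal entropy $\log_2 n$. Neither assumption is taken verbatim from the text, and the function names are hypothetical.

```python
import math


def l1_distance(n: int, delta: float) -> float:
    """L1 distance between the point mass P and Q = (1 - delta/2) P + (delta/2) uniform."""
    first = abs(1.0 - (1.0 - delta / 2.0 + delta / (2.0 * n)))
    rest = (n - 1) * delta / (2.0 * n)
    return first + rest


def renyi_entropy_of_mixture(n: int, alpha: float, delta: float) -> float:
    """Renyi entropy (in bits) of Q for alpha != 1, using the closed-form power sum."""
    q_first = 1.0 - delta / 2.0 + delta / (2.0 * n)
    q_rest = delta / (2.0 * n)
    power_sum = q_first ** alpha + (n - 1) * q_rest ** alpha
    return math.log2(power_sum) / (1.0 - alpha)


def normalized_gap(n: int, alpha: float, delta: float) -> float:
    """|I_alpha(P) - I_alpha(Q)| / log2(n); the point mass P has zero entropy."""
    return abs(renyi_entropy_of_mixture(n, alpha, delta)) / math.log2(n)


if __name__ == "__main__":
    delta, alpha = 0.01, 0.5
    for n in (10 ** 3, 10 ** 6, 10 ** 12):
        print(f"n={n}: L1 distance = {l1_distance(n, delta):.6f}, "
              f"normalized gap = {normalized_gap(n, alpha, delta):.4f}")
```

Although the two distributions stay within $\delta$ of each other in the $L_1$ sense, the normalised entropy gap grows with $n$ instead of shrinking, which is the non-uniform behaviour attributed to the critical arcs.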

We may now ask whether there exists some generalization of (25) such that the corresponding measure $\mu_B$ is non-zero. The answer is again negative. We show now that this is a consequence of the fact that almost all distributions $\mathcal{P} \in P_n$ fulfill Lesche's observability condition (4), while the Bhattacharyya measure of those distributions which do not comply with the condition (4) tends to zero at large $n$.

To prove this we utilize once again Levy's lemma. In this case we make the identification $f(\xi) = \|\xi\|_{2\alpha} / E(\|\xi\|_{2\alpha})$. Similarly as in the previous case, we must first determine the asymptotic behavior of the mean $E(\|\xi\|_{2\alpha})$. This can be achieved by employing Jensen's inequality



/r(a + 1/2)2° 




2a)2"dM < 



/ 



(48) 
together with the inequality



(49) 



i=l 



Therefore $E(\|\xi\|_{2\alpha})$ is unbounded at large $n$ and it approaches infinity as $a\, n^{\frac{1}{2\alpha} - \frac{1}{2}}$ ($a = a(n, \alpha)$ is some function with lower and upper bound in $n$). Employing now the estimate:



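$$
\left|\, \|\xi^{(1)}\|_{2\alpha} - \|\xi^{(2)}\|_{2\alpha} \right| \;\leq\; \|\xi^{(1)} - \xi^{(2)}\|_{2\alpha} \;\leq\; \|\xi^{(1)} - \xi^{(2)}\|_{2}\; n^{\frac{1-\alpha}{2\alpha}} \;\leq\; \sqrt{\delta}\; n^{\frac{1-\alpha}{2\alpha}} \tag{50}
$$

(where the triangle and Holder inequalities were successively applied) we obtain that $f(\xi)$ is $1/\underline{a}$-Lipschitz. Here $\underline{a}$ is the lower bound of $a$. Levy's lemma then implies that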



ll^ll 



2a 



1 



< e > 1 - 4e 



-'0a€ n 



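(51)

for any $\varepsilon > 0$. Result (51) suggests that for a sufficiently small $\varepsilon$ ($2\varepsilon < 1.59\ldots$) the inequality

$$
e^{-2\varepsilon} \;\leq\; 1 - \varepsilon \;\leq\; \frac{\|\xi\|_{2\alpha}}{E(\|\xi\|_{2\alpha})} \;\leq\; 1 + \varepsilon \;\leq\; e^{\varepsilon} \tag{52}
$$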



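holds for almost all $\xi \in P_n$ ($\mu_B \to 1$ as $n \to \infty$). So we again encounter the concentration of measure phenomenon: at large $n$ almost all Bhattacharyya measure is concentrated on $\xi$'s fulfilling the condition $\|\xi\|_{2\alpha} \approx E(\|\xi\|_{2\alpha})$. Using now

$$
\frac{\left|\, \|\xi^{(1)}\|_{2\alpha} - \|\xi^{(2)}\|_{2\alpha} \right|}{E(\|\xi\|_{2\alpha})} \;\leq\; \left| \frac{\|\xi^{(1)}\|_{2\alpha}}{E(\|\xi\|_{2\alpha})} - 1 \right| + \left| \frac{\|\xi^{(2)}\|_{2\alpha}}{E(\|\xi\|_{2\alpha})} - 1 \right| \;\leq\; 2\varepsilon\,, \tag{53}
$$

and bearing in mind (50) we can set $\delta = 4\varepsilon^2 \underline{a}^2$. Consequently (for $n > 3$)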



\M'P)-MQ)\ 



2a 



(1 - a) log2 n 

6ea 



log; 



< 



< 



2a 



(1 — q) Ign 



-2e 



(1 — Q:)lgn (1 — Q:)a 



(54) 



As in the previous case we can conclude that it is always possible to find an appropriate $\delta_\varepsilon$ for every $\varepsilon$, namely



4 < 



a(l — a)e 
3a 



(55) 



So the observability condition (4) is satisfied in all cases for which (52) holds. In passing we should mention that the underlying reason behind the relations (44) and (51) lies in the fact that the $n$-spheres $S^n$ equipped with the Bhattacharyya distance $\phi_n$ and Haar measure $\mu_n$ form the so-called normal Levy family [30,34]. It can be shown [30] that the concentration of measure phenomenon is an inherent property of any Levy family.



Clearly $\underline{a} \geq \left( \Gamma(\alpha + 1/2)\, 2^{\alpha} / \sqrt{\pi} \right)^{\frac{1}{2\alpha}} > 0.529$.

The moral of this section can be summarized in the following way: whenever one selects as the state space for Renyi entropies the space of all discrete statistics, then a non-uniform continuity behavior (i.e., violation of Lesche's condition (4)) can be observed for a certain set of distribution functions in the limit of large $n$. We demonstrated that the set of such critical distributions is of zero Bhattacharyya measure in the space of all $n \to \infty$ probability distributions. One may relate those zero measure distributions to the so-called $\ell_\alpha$-bounded distributions (i.e., distributions whose $\ell_\alpha$ norm has a non-zero lower bound for $\alpha > 1$ and a finite upper bound for $\alpha < 1$). This can be plainly seen from the fact that for $\ell_\alpha$-bounded distributions the critical conditions (41) and (52) cannot be satisfied.
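The distinction between typical and $\ell_\alpha$-bounded distributions for $\alpha > 1$ can be made tangible with a small numerical comparison. This is a sketch under stated assumptions: a "typical" distribution is modelled as $p_i = \xi_i^2$ with $\xi$ uniform on the sphere, and an $\ell_\alpha$-bounded one as a distribution that keeps a fixed probability `p_max` on a single outcome however large $n$ is. Both constructions and all function names are hypothetical illustrations.

```python
import random


def l_alpha_norm(p: list, alpha: float) -> float:
    """The l_alpha norm of a probability vector: (sum_i p_i^alpha)^(1 / alpha)."""
    return sum(x ** alpha for x in p) ** (1.0 / alpha)


def typical_distribution(n: int, rng: random.Random) -> list:
    """p_i = xi_i^2 for xi uniform on the unit sphere (normalised squared Gaussians)."""
    squares = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
    total = sum(squares)
    return [s / total for s in squares]


def dominated_distribution(n: int, p_max: float) -> list:
    """One outcome carries probability p_max; the remainder is spread uniformly."""
    rest = (1.0 - p_max) / (n - 1)
    return [p_max] + [rest] * (n - 1)


if __name__ == "__main__":
    rng = random.Random(1)
    alpha = 2.0
    for n in (20, 200, 2000):
        typical = l_alpha_norm(typical_distribution(n, rng), alpha)
        dominated = l_alpha_norm(dominated_distribution(n, 0.3), alpha)
        print(f"n={n}: typical norm = {typical:.4f}, dominated norm = {dominated:.4f}")
```

The norm of the typical distribution decays with $n$, while the dominated one never drops below `p_max`, which is the sense in which its $\ell_\alpha$ norm has a non-zero lower bound.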



V. OBSERVABILITY OF RENYI ENTROPIES: CONTINUOUS PROBABILITY 
DISTRIBUTIONS AND MULTIFRACTALS 

Let us briefly illustrate here that the conditions of absolutely continuous PDF's or multifractality are themselves sufficiently restrictive to ensure that the instabilities discussed in the previous section do not occur. To see this let us consider Eqs.(9) and (12). The latter imply that for any $\mathcal{P}_n$ and $\mathcal{P}'_n$ for which the renormalized Renyi entropy exists the following identity holds



-^a max 

- ' « " ' + o(l) . (56) 



£»(a)log2(l/0 

Superscript $r$ denotes renormalized quantities. Note particularly that the renormalized entropies are by construction finite and independent of $n$ (i.e., of $l$). Using the fact that $\lg x \leq (x - 1)$ together with the Holder inequality and Eq.(39) we have for two $\delta$-close distributions



2ak 



\W-TM\ < .^l "wil^ir/"" ^ l^l "'li'i'^i'M ^S, (57) 
|l-a| mm(||^||2a) mm(||^||2a) 

with $k = 1/\lg 2$. Realizing that (9) and (12) imply



(1-°) - 



1 1^1 Ua = e^^-^^") '+^o+o(i)) , (58) 

we can straightforwardly write that 

K-x;'l + o(i) < v^e^ira»a.-(x:wni+o(i) ^ j^^bVs. (59) 

Here B is an absolute constant representing the upper bound for the exponential. Gathering 
results (56) and (59) together we can finally write (for n > 2) 

\Ia{Pn)-Ia{K)\ < - I^'] + o(l) < BVS . (60) 

-^a max I -L 



It is then clear in this case that one can easily find an appropriate $\delta_\varepsilon$ for every $\varepsilon$, namely

represents a correct choice. So for all pairs $\mathcal{P}_n$ and $\mathcal{P}'_n$ which lead in the $n \to \infty$ limit to continuous PDF's (or multifractals) the Lesche condition (4) applies. It is therefore the very definition of systems with absolutely continuous PDF's/multifractals (incorporated in Eqs.(8) and (11)) that naturally avoids the situations with instability points confronted in the previous section.
VI. CONCLUSIONS



In this paper we have attempted to make sense of the recent claims concerning a total non-observability of Renyi's entropy. We have found that problems have arisen from uncritical use of Lesche's observability criterion. We have proved that the latter criterion, as it stands, does not rule out observability of Renyi entropies in a large class of systems, systems with a finite number of microstates or multifractals being examples. This is so because the structure of the space of distribution functions (or PDF's) over which such systems operate essentially prohibits the existence of "critical" situations considered by Lesche. In cases where such situations are encountered, namely in systems with a (countable) infinity of microstates, we argue that Lesche's uniform continuity condition is too tight to serve as a decisive criterion for observability.

In previous works the uniform continuity condition was used to force observability upon state functions. As we have shown, it is not just unnecessary to do this but it also causes the Lesche criterion to produce incorrect results in certain cases. By identifying the probability distribution with a state variable this has led to confusion about the observability of Renyi's entropy. Once the uniform continuity condition is dropped, we can clear up these confusing points. For this purpose we present a more intuitive concept of observability by allowing the quantity in question to have a certain amount of "critical" points, provided that the set of critical points in the state space is of zero measure.

It is definitely interesting to know what the "critical regions" correspond to. In the case of Renyi entropies we offer a partial reply to this question. Namely, for systems with a (countable) infinity of microstates we show that the critical regions correspond to the $\delta$-vicinity of $\ell_\alpha$-bounded distributions. Basically such distributions correspond to (ultra)rare events which are frequently encountered, e.g., in particle detection (double beta or tritium decays being examples). We have proved that the Bhattacharyya measure of these distributions must be zero. As $\ell_\alpha$-bounded distributions do not exist in (coarse-grained) multifractals or in systems with continuous PDF's, nor in systems at thermal equilibrium, there is no a priori reason to disregard Renyi entropies as observable in the aforementioned instances. On the other hand, it is known that many systems undergo "statistics transitions" (stock market bidding and continuous phase transitions with their exponential-law - power-law distribution "transitions" may serve as examples). It might also be expected that in dynamical systems away from equilibrium transitions to $\ell_\alpha$-bounded statistics may play a relevant role. In any case, one can turn the sensitivity of Renyi entropies into a virtue, as it could be used as a diagnostic instrument for an analysis of (ultra)rare-event systems, similarly as, for instance, the temperature sensitivity of the susceptibility is used as a diagnostic tool in continuous phase transitions. We believe that further investigation in this direction would be of great value.

Let us finally stress that there is also a conceptual reason why the observability à la Lesche should be viewed with some hint of scepticism. This is because the observability treated in such a framework is not a unique concept. Indeed, Lesche's condition can brand a quantity as observable under one choice of state variables and as non-observable under a different choice, even if two such choices overlap in the scope of physical situations they describe. A typical example is the Gibbs-Shannon entropy. Here, according to the above criterion, the entropy is observable if the probability distribution is chosen as the state variable [13,12]. On the other hand, if temperature and pressure are state variables then entropy develops a discontinuity in any system which undergoes a first order phase transition (Clausius-Clapeyron equation) and hence it is not, for such systems, a uniformly continuous function of state variables, and according to (1) (or (2)) it is doomed to be non-observable. In this connection it is interesting to notice that because the parameter $\alpha$ formally plays the role of inverse temperature [22,35] one may expect that various limits may not commute, similarly as in Gibbsian statistical physics. Namely, we may anticipate that $\lim_{\alpha \to 1} \lim_{n \to \infty} \neq \lim_{n \to \infty} \lim_{\alpha \to 1}$. In fact, Lesche [12] and other authors [13] applied the sequence of limits $\lim_{n \to \infty} \lim_{\alpha \to 1}$. In such a case they concluded that Renyi entropy of order one (Shannon's entropy) is observable while the rest of Renyi entropies is not (despite the fact that Renyi entropies are analytic in $\alpha \in \mathbb{R}^{+}$, see Ref. [20]). On the other hand, when one utilizes the "thermodynamical" order, i.e., $\lim_{\alpha \to 1} \lim_{n \to \infty}$, then also Renyi's entropy of order one develops instability points (this may be easily checked by noticing that the unobservability argument presented in [12] is continuous in $\alpha = 1$). The latter seems to support our previous comment that Shannon's entropy should not be uniformly continuous in the space of discrete distribution functions in order to account, for instance, for the first-order phase transitions.
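The non-commutativity of the two limits can be illustrated numerically on a simple pair of distributions. The sketch below is an illustration under explicit assumptions, not the argument of [12]. It takes a point mass $\mathcal{P}$ (zero entropy) and the mixture $\mathcal{Q} = (1 - \delta/2)\,\mathcal{P} + (\delta/2)\cdot\text{uniform}$, works with $\log_2 n$ as the input so that very large $n$ can be reached, and uses the large-$n$ approximations $n - 1 \approx n$ and $\delta/(2n) \ll 1 - \delta/2$. The function names are hypothetical.

```python
import math


def log2_sum_exp2(a: float, b: float) -> float:
    """Numerically stable log2(2**a + 2**b)."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))


def normalized_renyi(alpha: float, log2_n: float, delta: float) -> float:
    """I_alpha(Q) / log2(n) for alpha != 1 in the large-n approximation.

    Power sum: (1 - delta/2)^alpha + n^(1 - alpha) * (delta/2)^alpha.
    """
    term_first = alpha * math.log2(1.0 - delta / 2.0)
    term_rest = (1.0 - alpha) * log2_n + alpha * math.log2(delta / 2.0)
    return log2_sum_exp2(term_first, term_rest) / ((1.0 - alpha) * log2_n)


def normalized_shannon(log2_n: float, delta: float) -> float:
    """Shannon entropy of Q divided by log2(n), same large-n approximation."""
    heavy = -(1.0 - delta / 2.0) * math.log2(1.0 - delta / 2.0)
    light = -(delta / 2.0) * (math.log2(delta / 2.0) - log2_n)
    return (heavy + light) / log2_n


if __name__ == "__main__":
    delta = 0.01
    # alpha -> 1 first (Shannon), then n large: the normalized gap tends to delta / 2.
    print("alpha -> 1 first:", normalized_shannon(1.0e6, delta))
    # n large first at fixed alpha < 1, then alpha close to 1: the gap stays near 1.
    print("n -> infinity first:", normalized_renyi(0.999, 1.0e6, delta))
```

With $\delta = 0.01$ the first ordering gives a normalised gap close to $\delta/2$, whereas the second stays close to one, so in this example the outcome depends on the order in which the limits are taken.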